Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

refactor: migrate from vue/compiler-dom to parse5 #9678

Merged
merged 10 commits into from
Aug 24, 2022
Merged

Conversation

patak-dev
Copy link
Member

@patak-dev patak-dev commented Aug 15, 2022

Description

Fixes #5966
Fixes #5603

Vite has been using vue/compiler-dom to process HTML (for example to process inlined URLs). There are a few issues that are out of the scope of vue/compiler-dom, and although we could extend it, we discussed with the Team that migrating to a standard compliant HTML parser is better for the future.

This PR migrates to parse5, that is battle tested library used by many other projects like jsdom, Angular, and Lit.

I decided to keep the current approach of parsing and then use MagicString to do the changes. One difference is that the code locations for attrs are in the main node instead of in the attr token itself, and the info for attrs is the whole attr (foo="value") instead of the value as we had with compiler/dom. The rest is basically the same.

Remaining tasks:

  • Bundle parse5
  • Should we issue warnings for the parse5 errors we are ignoring? I think we should for duplicate attributes. For the other two, our own playgrounds have these issues.
    switch (parserError.code) {
      case 'missing-doctype':
        // ignore missing DOCTYPE
        return
      case 'abandoned-head-element-child':
        // Accept elements without closing tag in <head>
        return
      case 'duplicate-attribute':
        // Accept duplicate attributes #9566
        // The first attribute is used, browsers silently ignore duplicates
        return
    }

For the future:

  • We could expose parse5 to Vite users, there is a lot of value in having a standard parser for plugin and framework authors.

What is the purpose of this pull request?

  • Bug fix
  • New Feature
  • Documentation update
  • Other

Before submitting the PR, please make sure you do the following

  • Read the Contributing Guidelines.
  • Read the Pull Request Guidelines and follow the Commit Convention.
  • Check that there isn't already a PR that solves the problem the same way to avoid creating a duplicate.
  • Provide a description in this PR that addresses what the PR is solving, or reference the issue that it solves (e.g. fixes #123).
  • Ideally, include relevant tests that fail without this PR but pass with it.

@patak-dev
Copy link
Member Author

Vite Ecosystem CI is green for this PR (except for known glitches being solved on some projects): https://github.com/vitejs/vite-ecosystem-ci/actions/runs/2860308763

About issuing a warning for duplicate attributes, @surma mentioned that it could be noisy. I agree that without a way to turn these off, it may be bothersome if you can't fix your tooling.
I think we could move forward with this PR without warnings (to be merged in Vite 3.1), and then work on a feature to show warnings and allow users to disable them. We could even allow the user to ignore other parsing errors? Maybe html.ignoreErrors: [...].

@patak-dev patak-dev added feat: html p3-significant High priority enhancement (priority) labels Aug 15, 2022
@patak-dev patak-dev added this to the 3.1 milestone Aug 15, 2022
playground/html/index.html Outdated Show resolved Hide resolved
antfu
antfu previously approved these changes Aug 16, 2022
@bluwy
Copy link
Member

bluwy commented Aug 16, 2022

  • Should we issue warnings for the parse5 errors we are ignoring? I think we should for duplicate attributes. For the other two, our own playgrounds have these issues.

I think we can skip all of them if we want to be lenient with the html being passed. Otherwise we should report all the warnings instead as it could be confusing if we only report some of it.

@patak-dev patak-dev requested a review from bluwy August 22, 2022 20:43
bluwy
bluwy previously approved these changes Aug 23, 2022
Co-authored-by: Shinigami <chrissi92@hotmail.de>
Shinigami92
Shinigami92 previously approved these changes Aug 23, 2022
bluwy
bluwy previously approved these changes Aug 23, 2022
packages/vite/src/node/plugins/html.ts Outdated Show resolved Hide resolved
packages/vite/src/node/plugins/html.ts Show resolved Hide resolved
packages/vite/src/node/plugins/html.ts Outdated Show resolved Hide resolved
packages/vite/src/node/plugins/html.ts Show resolved Hide resolved
packages/vite/src/node/plugins/html.ts Show resolved Hide resolved
Co-authored-by: 翠 / green <green@sapphi.red>
@patak-dev patak-dev dismissed stale reviews from bluwy and Shinigami92 via 21e18cc August 23, 2022 14:18
Co-authored-by: 翠 / green <green@sapphi.red>
@patak-dev
Copy link
Member Author

Thanks a lot for the review @sapphi-red! I now recall why I didn't add all these guards. If you check in the https://astexplorer.net/#/1CHlCXc4n4, you'll see that when there are duplicated attrs, parse5 will already only have the first one in the attr array. Maybe I'm not looking at this correctly? Did you hit an issue with duplicates?

@sapphi-red
Copy link
Member

Oh, I wasn't aware of that. Sorry. I should have checked this on astexplorer.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
feat: html p3-significant High priority enhancement (priority)
Projects
Archived in project
Development

Successfully merging this pull request may close these issues.

HTML Parser fails if you have the same attribute twice Support for optional (closing) tags
5 participants